Partially identified prevalence estimation under misclassification using the kappa coefficient

Authors

  • Helmut Küchenhoff
  • Thomas Augustin
  • Anne Kunz
Abstract

Many empirical studies are confronted with severe misclassification error. While interest is in the realizations of a variable Y (e.g. disease yes/no), only a related variable Y* (e.g. the outcome of a diagnostic test) can be observed. Even in the case of simple prevalence estimation, naive estimation that neglects the fundamental difference between Y and Y* may be considerably biased ([1], [2]). If the sensitivity and specificity of the measurement instrument, and therefore the misclassification probabilities, are known, then powerful correction methods granting consistent estimation exist. On the other hand, without any knowledge of the misclassification process, the results simply remain vacuous. We present a corrected prevalence estimator for the case where some knowledge of the misclassification process is available from repeated measurements. In that setting we consider the situation where, as is quite common in medical studies, kappa, the coefficient of inter-rater agreement, is given. The problem is then still unidentified from the viewpoint of classical statistics, but relying on the concepts of partial identification and systematic sensitivity analysis in the spirit of [3] and [4] nevertheless gives valuable insight into prevalence estimation. We derive tight identification regions as interval-valued point estimators for the true prevalence, and then derive appropriate confidence regions relying on arguments from [4] and [5]. Our results are illustrated by several theoretical scenarios and with data from an oral health study on the prevalence of caries in children in Belgium.
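To make the known-sensitivity/specificity case concrete, the sketch below applies the standard correction that follows from P(Y* = 1) = sens · p + (1 − spec) · (1 − p), often called the Rogan-Gladen estimator. It then brute-forces an interval for the prevalence when sensitivity and specificity are only known to lie in given ranges. This is an illustration of the general idea of interval-valued estimates, not the kappa-based identification region derived in the paper; the function names, the grid search, and the numbers in the usage example are all assumptions made here.

```python
def corrected_prevalence(p_obs, sens, spec):
    """Correct an observed prevalence for misclassification.

    Solves p_obs = sens * p + (1 - spec) * (1 - p) for p and clips the
    result to [0, 1]. Requires sens + spec > 1 (an informative test).
    """
    denom = sens + spec - 1.0
    if denom <= 0.0:
        raise ValueError("sens + spec must exceed 1")
    p = (p_obs + spec - 1.0) / denom
    return min(1.0, max(0.0, p))


def prevalence_bounds(p_obs, sens_range, spec_range, steps=101):
    """Hypothetical helper: range of corrected prevalences over a grid
    of (sens, spec) pairs. A simple sensitivity analysis, not the
    paper's kappa-based region."""
    def grid(lo, hi):
        # Evenly spaced values including both endpoints.
        return [lo + i * (hi - lo) / (steps - 1) for i in range(steps)]

    values = [
        corrected_prevalence(p_obs, se, sp)
        for se in grid(*sens_range)
        for sp in grid(*spec_range)
    ]
    return min(values), max(values)


if __name__ == "__main__":
    # Illustrative numbers only, not taken from the caries study.
    print(corrected_prevalence(0.30, 0.90, 0.80))
    print(prevalence_bounds(0.30, (0.85, 0.95), (0.75, 0.85)))
```

With exact sensitivity and specificity the interval collapses to a single point; the wider the ranges, the wider the resulting interval, which mirrors the trade-off between assumptions and informativeness described in the abstract.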

Related articles

Assessing the Accuracy of Satellite Image Classifications for Pollutant Loadings Estimation

We examined accuracy measures for evaluating maps created from knowledge-based classifications of remotely sensed data. The automated classifications involved categories that showed different levels of annual loadings of six pollutants. From the classification error matrices that used spectral information and ancillary data, we computed the overall accuracy and the kappa coefficients. These co...

Prediction of potential habitat distribution of Artemisia sieberi Besser using data-driven methods in Poshtkouh rangelands of Yazd province

The present study aimed to model the potential habitat distribution of A. sieberi and its ecological requirements using a generalized additive model (GAM) and a classification and regression tree (CART) in the Poshtkouh rangelands of Yazd province. For this purpose, pure habitats of the species were delineated and species presence data were recorded by the systematic-randomized sampling method. Us...

Value at Risk Estimation using the Kappa Distribution with Application to Insurance Data

Heavy-tailed distributions are widely used for modeling financial data. The kappa distribution has a higher peak and a heavier tail than the normal distribution. In this paper, we consider the estimation of the three unknown parameters of a kappa distribution for evaluating the value at risk measure. The value at risk (VaR) as a quantile of a distribution is one of the import...

Validity of anthropometric equations for the estimation of muscle mass in the elderly Validade de equações antropométricas para estimar a massa muscular em idosos

The objective of this study was to compare skeletal muscle mass (SMM) in older adults as estimated by dual-energy X-ray absorptiometry (DXA) and three predictive equations. A total of 180 older adults (120 women and 60 men) aged 60 to 81 years were studied. Appendicular SMM was measured by DXA and estimated using predictive equations based on anthropometric measures, age, race, and gender. Agre...

An assessment of self-reported disease classification in epidemiological studies of dry eye.

PURPOSE To evaluate the reliability of self-reported dry eye disease status and patient-related predictors of misclassification in contact lens wearers. METHODS Patients completed the Contact Lens Dry Eye Questionnaire (CLDEQ) short form on two occasions. Test-retest reliability of the CLDEQ composite score was determined using the 95% limits of agreement (LoA) and an intraclass correlation c...

Journal:
  • Int. J. Approx. Reasoning

Volume: 53  Issue:

Pages: -

Publication date: 2012